Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Elasticsearch stop/start/restart #1990

Merged
merged 23 commits into from
Aug 6, 2018
Merged

Elasticsearch stop/start/restart #1990

merged 23 commits into from
Aug 6, 2018

Conversation

pr33thi
Copy link
Contributor

@pr33thi pr33thi commented Jul 31, 2018

Copy link
Member

@dannyroberts dannyroberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this has the major components in place and needs some testing and tweaking. Other notes

  • .travis/secrets.tar.enc change shouldn't be in this PR
  • We want to get rid of restart-elasticsearch, and I think there's no reason not to do it in the same PR and just make sure it's well tested

@@ -6,26 +6,46 @@
retries: 20
delay: 3
changed_when: result.stdout.find('"acknowledged":true') != -1
tags: action_stop

- debug: msg="{{ result }}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you want to keep this debug output?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


- name: get es instances to kill
shell: "ps aux | pgrep 'elasticsearc[h]' | awk '{print $2}'"
register: es_instances
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might call this es_pids rather than es_instances since that's more specifically what they are.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


- name: kill elasticsearch instances
shell: "pkill {{item}}"
with_items: es_instances
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work? I would have though that you would have to use es_instances.stdout.splitlines() or something

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if action == 'status':
return ElasticsearchClassic(self.environment, self.ansible_context).execute_action(action, host_pattern, process_pattern)
else:
response = six.moves.input("This function does more than stop and start the elasticsearch service. For that, use elasticsearch-classic."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: these lines are long. I'd suggest just

response = six.moves.input(
    "this style"
    "\nof indent"
)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"\nStop will: stop pillows, stop es, and kill -9 if any processes still exist after a period of time. "
"\nStart will start pillows and start elasticsearch "
"\nRestart is a stop followed by a start.\n Continue? [y/n]: ")
if response.lower() == "n":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use ask for this the way other commands in commcare-cloud do for a consistent experience (and consistent code). E.g.

if not ask('Have you stopped all the elastic pillows?', strict=True, quiet=args.quiet):
. (ask is just a little utility function we wrote, not a common python thing.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know
bed0323

if not exit_code == 0:
print("ERROR while starting pillows. Exiting.")
sys.exit(1)
self._run_rolling_restart_yml(tags='action_start')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like elasticsearch is being started after pillows are started if i'm reading this correctly. we probably want to start elasticsearch first, and then start pillows

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come we want to do it in that order?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most pillows populate data into elasticsearch, so for the time between the pillows starting and es starting, the pillows will just be creating errors. they will retry eventually but there's no point to starting a service if the service it depends on is down

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, done here: 9a8c2ae
(also refactored the pillow start/stop code in there)

@dannyroberts
Copy link
Member

This is looking great! I think next step is to run on staging (if you haven't) in a few different ways (restart, stop, start, etc.) and tell us the outcome. Then I think it's good to merge!

def _act_on_pillows(self, action):
# Used to stop or start pillows
ansible_context = AnsibleContext(None)
service = SERVICES_BY_NAME['pillowtop'](self.environment, ansible_context)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to reference it directly rather than looking it up in the dict?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, ba1d401

@pr33thi
Copy link
Contributor Author

pr33thi commented Aug 6, 2018

Just tested it on staging and confirms that it works as expected. Will merge after the tests pass unless anyone has any objections!

@pr33thi pr33thi merged commit 62ad3bd into master Aug 6, 2018
@pr33thi pr33thi deleted the pv/es-2 branch August 6, 2018 19:23
@dannyroberts
Copy link
Member

🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants